11 research outputs found

Gaussian Mixture Regression model with logistic weights, a penalized maximum likelihood approach

Author: Cohen Serge
Montuelle Lucie
Pennec Erwan Le
Publication date: 01/04/2013

We wish to estimate a conditional density using a Gaussian mixture regression model with logistic weights and means depending on the covariate. We aim to select the number of components of this model, as well as the other parameters, by a penalized maximum likelihood approach. We provide a lower bound on the penalty, proportional up to a logarithmic term to the dimension of each model, that ensures an oracle inequality for our estimator. Our theoretical analysis is supported by some numerical experiments.

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL UVSQ

Statistical learning for wind power : a modeling and stability study towards forecasting

Author: Fischer Aurélie
Montuelle Lucie
Mougeot Mathilde
Picard Dominique
Publication venue: 'Wiley'
Publication date: 07/09/2017

We focus on wind power modeling using machine learning techniques. We show, on real data provided by the wind energy company Maïa Eolis, that parametric models, even those closely following the physical equation relating wind production to wind speed, are outperformed by intelligent learning algorithms. In particular, the CART-Bagging algorithm gives very stable and promising results. In addition, as a step towards forecasting, we quantify the impact of using deteriorated wind measurements on performance. We also show on this application that the default methodology for selecting a subset of predictors provided in the standard random forest package can be refined, especially when one of the predictors has a major impact.

arXiv.org e-Print Archive

Hal-Diderot

Inégalités d'oracle et mélanges

Author: Montuelle Lucie
Publication venue: HAL CCSD
Publication date: 04/12/2014

This manuscript focuses on two functional estimation problems. For each problem, a non-asymptotic guarantee of the proposed estimator's performance is provided through an oracle inequality. In the conditional density estimation setting, mixtures of Gaussian regressions with exponential weights depending on the covariate are used. The model selection principle through penalized maximum likelihood estimation is applied and a condition on the penalty is derived. This condition is satisfied if the chosen penalty is proportional to the model dimension. The procedure is accompanied by an algorithm combining EM and a Newton algorithm, tested on synthetic and real data sets. In the framework of regression with sub-Gaussian noise, aggregating linear estimators using exponential weights yields an oracle inequality in deviation, thanks to PAC-Bayesian techniques. The main advantage of the proposed estimator is that it is easy to compute. Furthermore, taking the infinity norm of the regression function into account establishes a continuum between sharp and weak oracle inequalities.

Thèses en Ligne

INRIA a CCSD electronic archive server

Oracle inequalities and mixtures

Author: Montuelle Lucie
Publication date: 04/12/2014

This manuscript focuses on two functional estimation problems. For each problem, a non-asymptotic guarantee of the proposed estimator's performance is provided through an oracle inequality. In the conditional density estimation setting, mixtures of Gaussian regressions with exponential weights depending on the covariate are used. The model selection principle through penalized maximum likelihood estimation is applied and a condition on the penalty is derived. This condition is satisfied if the chosen penalty is proportional to the model dimension. The procedure is accompanied by an algorithm combining EM and a Newton algorithm, tested on synthetic and real data sets. In the framework of regression with sub-Gaussian noise, aggregating linear estimators using exponential weights yields an oracle inequality in deviation, thanks to PAC-Bayesian techniques. The main advantage of the proposed estimator is that it is easy to compute. Furthermore, taking the infinity norm of the regression function into account establishes a continuum between sharp and weak oracle inequalities.

Theses.fr

Régression gaussienne à poids logistiques et maximum de vraisemblance pénalisé

Author: Le Pennec Erwan
Montuelle Lucie
Publication venue: HAL CCSD
Publication date: 27/05/2013

This communication falls within the general framework of density estimation. We wish to estimate conditional densities using Gaussian mixtures, which amounts to estimating the various parameters of these mixtures, as well as the number of components, all of which depend on a covariate. This dependence makes parameter estimation harder than in the traditional setting of Gaussian mixtures with fixed parameters (McLachlan and Peel). Consequently, few theoretical results have been established for parameters conditioned on a covariate. We focus on logistic weights and means depending on the covariate. To our knowledge, the only results corresponding to this situation are those of Chamroukhi et al., who propose numerical simulations based on EM and the BIC criterion, with affine logistic weights and polynomial means. Relying on the theoretical tools provided by Cohen and Le Pennec, we present an oracle inequality for a penalized maximum likelihood strategy that allows estimating the various (varying) parameters of the mixture, as well as the number of components. We propose a choice of penalties, proportional to the model dimension, that ensures fast convergence of the error between the penalized maximum likelihood estimator and the target density. Finally, we illustrate our theoretical results with numerical simulations.

Hal-Diderot

Mixture of Gaussian regressions model with logistic weights, a penalized maximum likelihood approach

Author: Le Pennec Erwan
Montuelle Lucie
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2014

In the framework of conditional density estimation, we use candidates taking the form of mixtures of Gaussian regressions with logistic weights and means depending on the covariate. We aim to estimate the number of components of this mixture, as well as the other parameters, by a penalized maximum likelihood approach. We provide a lower bound on the penalty that ensures an oracle inequality for our estimator. We perform some numerical experiments that support our theoretical analysis.

Crossref

INRIA a CCSD electronic archive server

HAL-Polytechnique

Régression gaussienne à poids logistiques et maximum de vraisemblance pénalisé

Author: Le Pennec Erwan
Montuelle Lucie
Publication venue: HAL CCSD
Publication date: 27/05/2013

INRIA a CCSD electronic archive server

Hal-Diderot

Agrégation PAC-bayésienne d'estimateurs par projection

Author: Le Pennec Erwan
Montuelle Lucie
Publication venue: HAL CCSD
Publication date: 02/06/2014

Aggregating estimators using exponential weights depending on their risk performs well in expectation, but sadly not in probability. One way to overcome this issue is to consider exponential weights of a penalized risk. In this case, an oracle inequality can be obtained in probability, but it is not sharp. Taking the estimated function's norm into account in the penalty yields a sharp inequality.

INRIA a CCSD electronic archive server

HAL-Polytechnique

Evaluation de la qualité chimique et biologique des cours d'eau : pertinence et validité d'une gamme de techniques d'échantillonnage in situ

Author: Assoumani A.
Bados Philippe
Coquery Marina
Delest B.
Delmas François
Gouy Véronique
Guillemain C.
Lahjiouj F.
Lavieille D.
Liger Lucie
Lissalde S.
Margoum C.
Mazzella Nicolas
Montuelle Bernard
Moreira Sylvia
Morin Soizic
Motte B.
Pesce Stéphane
Publication venue: HAL CCSD
Publication date: 26/10/2011

This work mainly targeted the study of various sampling strategies to assess the water quality of rivers in relation to agricultural diffuse pollution. For that purpose, a panel of methodologies was implemented to provide additional knowledge on the dynamics of pesticide concentrations in several small rivers. One of our objectives was to compare the results obtained from spot sampling, automated integrated weekly sampling and passive sampling to evaluate the exposure of biofilms to various pesticides. These methods involved the development of innovative, more reliable and less expensive sampling techniques for in situ estimates of the time-weighted average concentrations of pesticides. Several types of passive samplers were tested: Polar Organic Chemical Integrative Samplers (POCIS) for hydrophilic organic pesticides, Stir Bar Sorptive Extraction (SBSE) for hydrophobic organic pesticides and Diffusion Gradient in Thin-Film (DGT) for metals. First, the implementation of POCIS and SBSE required analytical developments and laboratory calibrations. Then, the three types of tools were deployed in 2009 and 2010 at several sampling sites on the Morcille and Ardières rivers (Beaujolais area near Lyon), as well as in the Ruiné creek, a sub-basin of the Charente River. These sites are characterized by different agricultural, hydrological, physicochemical and geological contexts, which allowed us to study the performance and limitations of the different sampling techniques under various conditions. Furthermore, analytical methods for measuring pesticides and metals accumulated in the river biofilms were developed. Hence, we were able to evaluate the bioaccumulated contaminants and their likely impacts on the periphyton, and to compare the results with the exposure estimates derived from the different sampling techniques.

HAL-UNILIM